Abstract

We work in the context of the infinitely many alleles model. The allelic partition
associated with a coalescent process started from $n$ individuals is obtained by placing
mutations along the skeleton of the coalescent tree; for each individual, we trace
back to the most recent mutation affecting it and group together individuals whose
most recent mutations are the same. The number of blocks of each of the different
possible sizes in this partition is the allele frequency spectrum. The celebrated
Ewens sampling formula gives precise probabilities for the allele frequency spectrum
associated with Kingman's coalescent. Kingman's coalescent and the degenerate star-shaped
coalescent are the only $\Lambda$-coalescents for which explicit probabilities are known,
although the probabilities are known to satisfy a recursion due to Möhle. Recently, Berestycki,
Berestycki and Schweinsberg have proved asymptotic results for the allele frequency
spectra of the Beta$(2 - \alpha, \alpha)$ coalescents with $\alpha \in (1, 2)$. In this paper, we prove
full asymptotics for the case of the Bolthausen-Sznitman coalescent.



1 Introduction 



1.1 Exchangeable random partitions 

In recent years, the topic of exchangeable random partitions has received a lot of attention
(see Pitman [35] for a lucid introduction). A random partition of $\mathbb{N}$ is said to be
exchangeable if, for any permutation $\sigma : \mathbb{N} \to \mathbb{N}$ such that $\sigma(i) = i$ for all $i$ sufficiently
large, the distribution of the partition is unaffected by the application of $\sigma$.
It was proved by Kingman [28, 29] that if the partition has blocks $(B_i, i \ge 1)$ listed in
increasing order of least elements then the asymptotic frequencies,

$$f_i \overset{\mathrm{def}}{=} \lim_{n \to \infty} \frac{|B_i \cap \{1, 2, \ldots, n\}|}{n}, \quad i \ge 1,$$

exist almost surely. Let $(f_i^{\downarrow})_{i \ge 1}$ be the collection of asymptotic frequencies ranked in
decreasing order. Then we can view $(f_i^{\downarrow})_{i \ge 1}$ as a partition of $[0, 1]$ into intervals of decreasing
length. In general, since it is possible that $\sum_{i \ge 1} f_i^{\downarrow} < 1$, there will also be a distinguished
interval of length $1 - \sum_{i \ge 1} f_i^{\downarrow}$. Consider now the following paintbox process, which creates
a random partition of $\mathbb{N}$ starting from the frequencies. Take independent uniform random
variables $U_1, U_2, \ldots$ on $[0, 1]$. If $U_i$ and $U_j$ land in the same non-distinguished interval of
the partition then assign $i$ and $j$ to be in the same block. If $U_i$ lands in the distinguished
interval, assign $i$ to a singleton block. The partition we create in this way is exchangeable
and has the same distribution as the partition with which we began. This procedure can
also be thought of in terms of a classical balls-in-boxes problem with infinitely many
unlabelled boxes; see in particular Karlin [27] and Gnedin, Hansen and Pitman [18].
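
The paintbox construction can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name `paintbox_partition` is hypothetical, the non-distinguished intervals are laid out consecutively from 0, and whatever mass is left over after the listed frequencies plays the role of the distinguished interval.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def paintbox_partition(freqs, n, rng):
    """Restriction to {1, ..., n} of a paintbox partition.

    freqs: ranked block frequencies with sum <= 1 (assumed input);
           the leftover mass 1 - sum(freqs) is the distinguished interval.
    rng:   a random.Random instance.
    """
    cuts = list(accumulate(freqs))   # right endpoints of the intervals
    blocks = {}                      # interval index -> integers landing there
    singletons = []                  # integers landing in the distinguished interval
    for i in range(1, n + 1):
        u = rng.random()
        idx = bisect_right(cuts, u)  # index of the interval containing u
        if idx < len(freqs):
            blocks.setdefault(idx, []).append(i)
        else:
            singletons.append([i])
    # List blocks in increasing order of least element.
    return sorted(list(blocks.values()) + singletons, key=lambda b: b[0])
```

For instance, `paintbox_partition([0.5, 0.2], 20, random.Random(0))` returns a partition of $\{1, \ldots, 20\}$ in which roughly 30% of the integers are expected to sit in singleton blocks.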

There are several natural questions that we may ask about an exchangeable random 
partition restricted to the first n integers (or, equivalently, about the partition formed 
by the first n uniform random variables in the paintbox process). How many blocks does 
this partition have? How many blocks does it have of size exactly $k$, for $1 \le k \le n$? Even
in the absence of precise distributional information for finite $n$, can we obtain $n \to \infty$
limits for these quantities, in an appropriate sense? These questions have been studied 
for various classes of exchangeable random partitions and random compositions, see in 
particular the work of Gnedin, Pitman and co-authors: [H [T7J I2D1 EH E21 EH] • 

1.2 Coalescent process and allelic partitions 

In this paper, we study a particular exchangeable random partition which derives from 
a coalescent process. The origins of this partition lie in population genetics and we will 
now describe how it arises and give a brief review of the relevant literature. For large 
populations, genealogies are often modelled using Kingman's coalescent [30]. This is a
Markov process taking values in the space of partitions of N (or [n] = {1, 2, . . . , n}), 
such that the partition becomes coarser and coarser with time. Whenever the current 
state has b blocks, any pair of them coalesces at rate 1, independently of the other blocks 
and irrespective of the block sizes. We start with a sample of genetic material from 
n individuals. Here, n is taken to be small compared to the total underlying popula- 
tion size. We imagine tracing the genealogy of the sample backwards in time from the 
present. Then the blocks of the coalescent process at time t correspond to the groups of 
individuals having the same ancestor time t ago (where time is measured in units of the 
total underlying population size). See Ewens [TB] or Durrett [H] for full introductions to 
this subject. In the population genetics setting, it is natural to introduce the concept of 
mutation into this model. One of the most celebrated results in this area is the Ewens 
Sampling Formula, which was proved by Ewens [15] in 1972. It concerns the infinitely 
many alleles model, in which every mutation gives rise to a completely new type. It says 
that if we take a sample of n genes subject to neutral mutation (that is, mutation which 
does not confer a selective advantage) which occurs at rate $\theta/2$ for each individual, then
the probability $q(m_1, m_2, \ldots)$ that there are $m_j$ types which occur exactly $j$ times is given
by

$$q(m_1, m_2, \ldots) = \frac{n!\, \theta^{\sum_{j \ge 1} m_j}}{(\theta)_{n\uparrow} \prod_{j \ge 1} j^{m_j}\, m_j!},$$

where $(\theta)_{n\uparrow} = \theta(\theta + 1) \cdots (\theta + n - 1)$ and we must have $\sum_{j \ge 1} j m_j = n$.

Figure 1: Left: a coalescent tree with mutations. Right: the sections of the tree relevant
for the formation of the allelic partition. Note that from each individual we look back
only to the last mutation, so that the second mutation on the lineage of 6 is ignored. The
allelic partition here is {1}, {2, 3, 5}, {4, 7, 8}, {6}. If $N_k(n)$ is the number of blocks of
size $k$ when we start with $n$ individuals, then we have $N_1(8) = 2$, $N_2(8) = 0$, $N_3(8) = 2$,
$N_4(8) = N_5(8) = \cdots = N_8(8) = 0$.

Another way
of expressing this (due to Kingman [2H]) is to picture the coalescent tree associated with 
Kingman's coalescent and place mutations along the length of the skeleton as a Poisson 
process of intensity $\theta/2$. For each individual, trace backwards in time (i.e. forwards in
coalescent time) to the most recent mutation. Group together those individuals whose
most recent mutations are the same; this gives the allelic partition. Then $m_j$ is the
number of blocks in the allelic partition containing exactly $j$ individuals.
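
As a sanity check on the Ewens Sampling Formula, the following Python sketch evaluates $q(m_1, m_2, \ldots) = n!\,\theta^{\sum_j m_j} / \big((\theta)_{n\uparrow} \prod_j j^{m_j} m_j!\big)$ in exact rational arithmetic and sums it over all allele frequency spectra of a sample of size $n$. The helper names (`partitions`, `ewens_probability`, `total_mass`) are hypothetical and chosen for this illustration only.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield the integer partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def ewens_probability(counts, theta):
    """Ewens probability of the spectrum counts = {j: m_j} for mutation parameter theta."""
    n = sum(j * m for j, m in counts.items())
    rising = Fraction(1)
    for i in range(n):                      # theta (theta + 1) ... (theta + n - 1)
        rising *= theta + i
    numerator = Fraction(factorial(n)) * Fraction(theta) ** sum(counts.values())
    denominator = rising
    for j, m in counts.items():
        denominator *= Fraction(j) ** m * factorial(m)
    return numerator / denominator


def total_mass(n, theta):
    """Sum of the Ewens probabilities over all spectra of a sample of size n."""
    return sum(ewens_probability(Counter(p), theta) for p in partitions(n))
```

With $\theta = 1$ the formula reduces to the cycle-type distribution of a uniform random permutation; for example, `ewens_probability({3: 1}, Fraction(1))` is the probability that a uniform permutation of three elements is a single 3-cycle.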

It is natural to extend these ideas to more general coalescent processes. See Figure 1
for an example of a general coalescent tree and its allelic partition. The $\Lambda$-coalescents
are a class of Markovian coalescent processes which were introduced by Pitman [34] and
Sagitov [37]. Like Kingman's coalescent, they take as their state-space the set of partitions
of $[n]$ (or, indeed, of the whole set of natural numbers). Their evolution is such that only
one block is formed in any coalescence event and rates of coalescence depend only on the
number of blocks present and not on their sizes. Take $\Lambda$ to be a finite measure on $[0, 1]$.
In order to give a formal description of the coalescent, it is sufficient to give its jump
rates. Whenever there are $b$ blocks present, any particular $k$ of them coalesce at rate

$$\lambda_{b,k} = \int_0^1 x^{k-2} (1 - x)^{b-k} \Lambda(dx), \quad 2 \le k \le b.$$

Note that, in contrast to Kingman's coalescent, here we allow multiple collisions; that
is, we allow more than two blocks to join together. Kingman's coalescent is the case
$\Lambda(dx) = \delta_0(dx)$, where unit mass is placed at 0. The case $\Lambda(dx) = dx$, called the
Bolthausen-Sznitman coalescent, was introduced by Bolthausen and Sznitman [7] in the
context of spin glasses. It has many nice properties and appears to be more tractable
than most $\Lambda$-coalescents. For example, its marginal distributions are known explicitly.
It has been studied in some detail: see, for example, Pitman [33], Bertoin and Le
Gall [5], Basdevant [2] and Goldschmidt and Martin [25].
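
For the Bolthausen-Sznitman coalescent the integral defining the jump rates is a Beta integral, so that $\lambda_{b,k} = (k-2)!\,(b-k)!/(b-1)!$. The Python sketch below (function names are hypothetical, chosen for this illustration) evaluates these rates exactly and checks three elementary identities: the consistency relation $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$, the total rate $\binom{b}{k}\lambda_{b,k} = b/(k(k-1))$ of a $k$-fold collision, and the total coalescence rate $b - 1$.

```python
from fractions import Fraction
from math import comb, factorial


def bs_rate(b, k):
    """Rate at which a particular k-tuple out of b blocks merges (Lambda(dx) = dx)."""
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def k_collision_rate(b, k):
    """Total rate at which some k of the b blocks merge."""
    return comb(b, k) * bs_rate(b, k)


def total_rate(b):
    """Total coalescence rate when b blocks are present."""
    return sum(k_collision_rate(b, k) for k in range(2, b + 1))


def identities_hold(b_max):
    """Check the three identities above for all 2 <= k <= b <= b_max."""
    for b in range(2, b_max + 1):
        if total_rate(b) != b - 1:
            return False
        for k in range(2, b + 1):
            if bs_rate(b, k) != bs_rate(b + 1, k) + bs_rate(b + 1, k + 1):
                return False
            if k_collision_rate(b, k) != Fraction(b, k * (k - 1)):
                return False
    return True
```

The identity $\binom{b}{k}\lambda_{b,k} = b/(k(k-1))$ is the one used repeatedly in the drift computations of Section 3.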

Another subset of the $\Lambda$-coalescents which has recently been particularly studied is the
Beta coalescents, so-called because $\Lambda$ here is a beta density:

$$\Lambda(dx) = \frac{1}{\Gamma(2 - \alpha)\Gamma(\alpha)}\, x^{1-\alpha} (1 - x)^{\alpha - 1}\, dx,$$

for some $\alpha \in (0, 2)$. (The $\alpha = 1$ case is the Bolthausen-Sznitman coalescent and, in
some sense, $\alpha = 2$ corresponds to Kingman's coalescent.) See Birkner et al. [6] for a
representation in terms of continuous-state branching processes when $\alpha \in (0, 2)$.

If we suppose that instead of Kingman's coalescent, the genealogy of the population
evolves according to a general $\Lambda$-coalescent then, except in the special case of the degenerate
star-shaped coalescent (where $\Lambda(dx) = \delta_1(dx)$), there is no known explicit expression
for the probability $q(m_1, m_2, \ldots)$ of having $m_j$ blocks in the allelic partition of
size $j$. However, Möhle [31] has shown that the probabilities $q$ must satisfy the following
recursion:

$$q(\mathbf{m}) = \frac{n\rho}{\lambda_n + n\rho}\, q(\mathbf{m} - \mathbf{e}_1) + \sum_{i=1}^{n-1} \frac{\binom{n}{i+1} \lambda_{n,i+1}}{\lambda_n + n\rho} \sum_{j=1}^{n-i} \frac{j(m_j + 1)}{n - i}\, q(\mathbf{m} + \mathbf{e}_j - \mathbf{e}_{i+j}),$$

where $\lambda_n = \sum_{k=2}^{n} \binom{n}{k} \lambda_{n,k}$, $\rho = \theta/2$, $\mathbf{m} = (m_1, m_2, \ldots)$ and $\mathbf{e}_i$ is the vector with a 1
in the $i$th co-ordinate and 0 in all the rest. He has also shown [33] that, except in the
cases of the star-shaped coalescent and Kingman's coalescent, the allelic partition is not 
regenerative in the sense of Gnedin and Pitman [TUj. Dong, Gnedin and Pitman [12] have 
studied various properties of the allelic partition of a general A-coalescent. In particular, 
they view the allelic partition as the final partition of a coalescent process with freeze 
(see Section 2 where we use this formalism) and also give an alternative description of $q$
as the stationary distribution of a certain discrete-time Markov chain. 

Consider again the Beta coalescents. Suppose that we start the coalescent process from
the partition of $[n]$ into singletons. Let $N_k(n)$ be the number of blocks of size $k$ in the
allelic partition, for $k \ge 1$, and let $N(n)$ be the total number of blocks, so that
$N(n) = \sum_{k=1}^{n} N_k(n)$. Then the complete allele frequency spectrum is the vector

$$(N_1(n), N_2(n), N_3(n), \ldots).$$

In the case of $\alpha \in (1, 2)$, Berestycki, Berestycki and Schweinsberg [3, 4] have proved that

$$n^{\alpha - 2} N(n) \xrightarrow{p} \frac{\rho\, \alpha (\alpha - 1) \Gamma(\alpha)}{2 - \alpha}$$

and, for $k \ge 1$, that

$$n^{\alpha - 2} N_k(n) \xrightarrow{p} \frac{\rho\, \alpha (\alpha - 1)^2\, \Gamma(k + \alpha - 2)}{k!}$$

as $n \to \infty$.
The corresponding convergence results for Kingman's coalescent can be derived from the
Ewens sampling formula: without rescaling, we have

$$(N_1(n), N_2(n), \ldots) \xrightarrow{d} (Z_1, Z_2, \ldots),$$

where $Z_1, Z_2, \ldots$ are independent Poisson random variables such that $Z_i$ has mean $\theta/i$.
It follows that

$$\frac{N(n)}{\log n} \xrightarrow{a.s.} \theta$$

as $n \to \infty$ and, moreover, that

$$\frac{N(n) - \theta \log n}{\sqrt{\theta \log n}} \xrightarrow{d} \mathrm{N}(0, 1).$$

It is clear that the Beta coalescents belong to a completely different asymptotic regime. 

A related problem concerns the infinitely many sites model. Here, as before, we put 
mutations on the coalescent tree, but this time we imagine that we trace the genealogy 
of long stretches of chromosome from each of our n individuals. Each time a mutation 
arrives, it affects a different site on the chromosome. The number of segregating sites is the 
number of sites at which there exists more than one allele in our sample of chromosomes. 
This is simply the number of mutations on the skeleton of the coalescent tree. Let S(n) 
be the number of segregating sites when we start with a sample of n individuals. Clearly 
the distributions of S(n) and N(n) are related, in that in both cases we count mutations 
along the skeleton of the coalescent tree; for N(n), we discard any mutation which arises 
on a lineage all of whose members have already mutated. In [32], Möhle has studied
the limiting distribution of $S(n)$ in the special case where the measure $x^{-1}\Lambda(dx)$ is finite
(which includes the Beta coalescents with $\alpha \in (0, 1)$). He proves that

$$\frac{S(n)}{n} \xrightarrow{d} \rho \int_0^{\infty} \exp(-\sigma_t)\, dt, \qquad (1)$$

where $(\sigma_t)_{t \ge 0}$ is a drift-free subordinator with Lévy measure given by the image under
the transformation $x \mapsto -\log(1 - x)$ of the measure $x^{-2}\Lambda(dx)$.

The number of segregating sites is, in turn, closely related to the length of the coalescent 
tree (i.e. the sum of the lengths of all of the branches) and to the total number of collisions 
before absorption. This has been studied for various A-coalescents in (TTJ [131 Ell EHj. 



1.3 The Bolthausen-Sznitman allelic partition 



Turning now to the Bolthausen-Sznitman coalescent, Drmota, Iksanov, Möhle and Rösler
[13] have proved that

$$\frac{\log n}{n}\, S(n) \xrightarrow{p} \rho,$$

where $S(n)$ is the number of segregating sites. They have also proved the corresponding
"central limit theorem",

$$\frac{S(n) - \rho a_n}{\rho b_n} \xrightarrow{d} S,$$

where $a_n = \frac{n}{\log n} + \frac{n \log\log n}{(\log n)^2}$, $b_n = \frac{n}{(\log n)^2}$ and $S$ is a stable random variable having
characteristic function

$$\mathbb{E}\left[e^{itS}\right] = \exp\left(-\frac{\pi}{2}|t| + it \log|t|\right).$$

The purpose of this paper is to prove the following theorem concerning the complete 
allele frequency spectrum of the Bolthausen-Sznitman coalescent. 

Theorem 1.1. For $k \ge 1$, let $N_k(n)$ be the number of blocks of the allelic partition of
size $k$ when we start with $n$ singleton blocks. Then

$$\frac{\log n}{n}\, N_1(n) \xrightarrow{p} \rho$$

and, for $k \ge 2$,

$$\frac{(\log n)^2}{n}\, N_k(n) \xrightarrow{p} \frac{\rho}{k(k - 1)}.$$

As a corollary, we obtain that $N(n)$, which is a priori smaller than $S(n)$, has the same
first-order asymptotics:

$$\frac{\log n}{n}\, N(n) \xrightarrow{p} \rho.$$

Suppose that we start a general $\Lambda$-coalescent $(\Pi(t))_{t \ge 0}$ from the partition of $\mathbb{N}$ into
singletons. Then it has been proved by Pitman [34] that either $\Pi(t)$ has only finitely many
blocks for all $t > 0$ ($(\Pi(t))_{t \ge 0}$ comes down from infinity) or $\Pi(t)$ has infinitely many
blocks for all time ($(\Pi(t))_{t \ge 0}$ stays infinite). See Schweinsberg [38] for an explicit
criterion for when a $\Lambda$-coalescent comes down from infinity, in terms of the $\lambda_{b,k}$'s. The
fundamental difference between the Beta coalescents for $\alpha \in (1, 2)$ and $\alpha \in (0, 1]$ (including
the Bolthausen-Sznitman coalescent) is that the former coalescents come down from
infinity and the latter do not. This accounts for the fact that in Berestycki, Berestycki
and Schweinsberg's result, the scalings are the same for all different sizes of block
as $n$ becomes large, whereas in our theorem, the singletons must be scaled differently.
Essentially, coalescence occurs rather slowly and the overwhelming first-order effect is
mutation, which causes the allelic partition to consist mostly of singletons. However,
at the second order (i.e. considering $(N_2(n), N_3(n), \ldots)$), we can feel the effect of the
coalescence.

We do not claim that our results are of any application in population genetics: to the 
best of our knowledge, the Bolthausen-Sznitman coalescent has not been used to model 
the genealogy of any biological population. Nonetheless, our method may extend to the 
case of coalescents which are more biologically realistic. 

Our method of proof is of some interest in itself. We track the formation of the allelic 
partition using a certain Markov process, for which we then prove a fluid limit (functional 
law of large numbers). The terminal value of our process gives the allele frequency 
spectrum and the fluid limit result, after a little extra work, allows us to read off the
asymptotics. 

Fluid limits have been widely used in the analysis of stochastic networks (see, for example, 
[8], [39]) and in the study of random graphs ([9], [36], [10]). In some sense, the prototypical 
result of the type in which we are interested is the following: suppose we take a Poisson
process $(X(t))_{t \ge 0}$ of rate 1, started from 0. Then the re-scaled process $(N^{-1} X(Nt))_{t \ge 0}$
stays close (in a rather strong sense) to the deterministic function $x(t) = t$, at least
on compact time-intervals. For a general pure jump Markov process, the fluid limit is
determined as the solution to a differential equation. In this article we have relied on
the neat formulation in Darling and Norris [10]. However, our fluid limit is somewhat
unusual. Firstly, instead of scaling time up, we actually scale it down, by a factor of 
logn. Moreover, we have three different "space" scalings for different co-ordinates of our 
(multidimensional) process. 
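
The prototypical Poisson example can be explored numerically. The following Python sketch (the function name `sup_deviation` is hypothetical) simulates a rate-1 Poisson process on $[0, NT]$ and returns $\sup_{0 \le t \le T} |N^{-1}X(Nt) - t|$. Since $X$ is piecewise constant, the supremum is attained either just before or just after a jump, or at the final time, so only those instants are inspected.

```python
import random


def sup_deviation(N, T, rng):
    """sup over t in [0, T] of |X(N t)/N - t| for a rate-1 Poisson process X."""
    horizon = N * T
    t, count, worst = 0.0, 0, 0.0
    while True:
        t_next = t + rng.expovariate(1.0)   # next jump time of X
        if t_next > horizon:
            # No further jump before the horizon: check the final time.
            worst = max(worst, abs(count - horizon) / N)
            return worst
        # Deviation just before the jump and just after it.
        worst = max(worst,
                    abs(count - t_next) / N,
                    abs(count + 1 - t_next) / N)
        t, count = t_next, count + 1
```

Calling `sup_deviation(N, 1.0, random.Random(0))` for increasing values of `N` should show the deviation shrinking roughly like $N^{-1/2}$, which is the sense in which the rescaled process stays close to $x(t) = t$.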



2 Fluid limit 



Consider the formation of the allelic partition, starting from the partition into singletons 
and run until every individual has received a mutation. The easiest way to think of 
this is to use the terminology of Dong, Gnedin and Pitman [12] in which blocks have two 
possible states: active and frozen. We start with all blocks active and equal to singletons. 
Active blocks coalesce according to the rules of the Bolthausen-Sznitman coalescent: if
there are $b$ active blocks present then any particular $k$ of them coalesce at rate
$\lambda_{b,k} = \frac{(k-2)!\,(b-k)!}{(b-1)!}$. Moreover, every active block becomes frozen at rate $\rho$ and stays
frozen forever (this act of freezing creates a block in the allelic partition).
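
The coalescent with freeze is straightforward to simulate, and the sketch below does so in Python. It is an illustration under stated assumptions rather than part of the proof: the function name is hypothetical, only the embedded jump chain is simulated (the holding times do not affect the final allelic partition), and the size $j$ of a merger is drawn by inverting the distribution function $P(J \ge j) = (1/(j-1) - 1/M)/(1 - 1/M)$, which follows from the fact that a $j$-fold collision among $M$ active blocks has total rate $M/(j(j-1))$.

```python
import random
from collections import Counter


def allele_frequency_spectrum(n, rho, rng):
    """Simulate the Bolthausen-Sznitman coalescent with freeze from n singletons.

    Returns a Counter mapping block size k to the number N_k(n) of blocks
    of size k in the allelic partition.
    """
    if rho <= 0:
        raise ValueError("rho must be positive so that every block eventually freezes")
    active = [1] * n                 # sizes of the active blocks
    spectrum = Counter()
    while active:
        M = len(active)
        coal_rate = M - 1            # sum over j of M / (j (j - 1)), j = 2..M
        freeze_rate = rho * M
        if rng.random() * (coal_rate + freeze_rate) < freeze_rate:
            # A uniformly chosen active block freezes.
            i = rng.randrange(M)
            active[i], active[-1] = active[-1], active[i]
            spectrum[active.pop()] += 1
        else:
            # Merger size J with P(J = j) proportional to 1 / (j (j - 1)).
            u = 1.0 - rng.random()                   # uniform on (0, 1]
            v = 1.0 / M + u * (1.0 - 1.0 / M)
            j = min(M, max(2, int(1.0 / v) + 1))     # clamp against rounding
            chosen = set(rng.sample(range(M), j))    # j blocks, uniformly at random
            merged = sum(active[i] for i in chosen)
            active = [s for i, s in enumerate(active) if i not in chosen]
            active.append(merged)
    return spectrum
```

Because the relevant scalings involve powers of $\log n$, convergence is slow, and simulated values of $(\log n / n)\, N_1(n)$ at moderate $n$ should not be expected to sit very close to $\rho$.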

The data we will track are as follows. Let $X_k^n(t)$ be the number of active blocks of the
coalescent partition at time $t$ containing $k$ individuals, $k \ge 1$, where we start at time 0
with $n$ active individuals in singleton blocks. For $k \ge 1$, let $Z_k^n(t)$ be the number of
blocks of the allelic partition of size $k$ which have already been formed by time $t$ (this
is the number of times so far that an active block containing precisely $k$ individuals has
become frozen). For $d \ge 1$, let $Y_{d+1}^n(t) = \sum_{k \ge d+1} X_k^n(t)$, the number of active blocks
containing at least $d + 1$ individuals.

It is straightforward to see that, for any $d \ge 1$,

$$X^{n,d}(t) \overset{\mathrm{def}}{=} \left(X_1^n(t), X_2^n(t), \ldots, X_d^n(t), Y_{d+1}^n(t), Z_d^n(t)\right)_{t \ge 0}$$

is a (time-homogeneous) Markov jump process taking values in $\{0, 1, 2, \ldots, n\}^{d+2}$, with

$$X_1^n(0) = n, \quad X_k^n(0) = 0, \ 2 \le k \le d, \quad Y_{d+1}^n(0) = 0, \quad Z_d^n(0) = 0.$$
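Now put

$$\bar{X}_1^n(t) = \frac{1}{n} X_1^n\!\left(\frac{t}{\log n}\right), \qquad \bar{X}_k^n(t) = \frac{\log n}{n} X_k^n\!\left(\frac{t}{\log n}\right) \quad \text{for } k \ge 2,$$

$$\bar{Z}_1^n(t) = \frac{\log n}{n} Z_1^n\!\left(\frac{t}{\log n}\right), \qquad \bar{Z}_k^n(t) = \frac{(\log n)^2}{n} Z_k^n\!\left(\frac{t}{\log n}\right) \quad \text{for } k \ge 2$$

and

$$\bar{Y}_{d+1}^n(t) = \frac{\log n}{n} Y_{d+1}^n\!\left(\frac{t}{\log n}\right) \quad \text{for } d \ge 1.$$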

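Fix $d \ge 1$ and write

$$\bar{X}^{n,d}(t) = \left(\bar{X}_1^n(t), \bar{X}_2^n(t), \ldots, \bar{X}_d^n(t), \bar{Y}_{d+1}^n(t), \bar{Z}_d^n(t)\right)$$

and define a stopping time

$$T_n = \inf\{t \ge 0 : X_1^n(t) = \cdots = X_d^n(t) = Y_{d+1}^n(t) = 0\},$$

the first time at which no active blocks remain.
(Note that $T_n$ is the same regardless of the value of $d$.)
For $t \ge 0$, let

$$x_1(t) = e^{-t}, \qquad x_k(t) = \frac{t e^{-t}}{k(k-1)}, \quad 2 \le k \le d,$$

$$z_1(t) = \rho\left(1 - e^{-t}\right), \qquad z_k(t) = \frac{\rho}{k(k-1)}\left(1 - e^{-t} - t e^{-t}\right), \quad 2 \le k \le d,$$

and

$$y_{d+1}(t) = \frac{t e^{-t}}{d}.$$

Set

$$x^{(d)}(t) = \left(x_1(t), x_2(t), \ldots, x_d(t), y_{d+1}(t), z_d(t)\right).$$

We write $\|\cdot\|$ for the Euclidean norm on $\mathbb{R}^{d+2}$.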

Proposition 2.1. Fix $d \ge 1$ and let $t_0 < \infty$. Then, given $\varepsilon > 0$,

$$\mathbb{P}\left(\sup_{0 \le t \le t_0} \left\|\bar{X}^{n,d}(t) - x^{(d)}(t)\right\| > \varepsilon\right) \to 0$$

as $n \to \infty$.



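This is the key to the following result.

Proposition 2.2. Take $\delta > 0$. Then

$$\mathbb{P}\left(\left|\frac{\log n}{n} Z_1^n(T_n) - \rho\right| > \delta\right) \to 0$$

and, for $k \ge 2$,

$$\mathbb{P}\left(\left|\frac{(\log n)^2}{n} Z_k^n(T_n) - \frac{\rho}{k(k-1)}\right| > \delta\right) \to 0,$$

as $n \to \infty$.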
Theorem 1.1 now follows directly, since $N_k(n) = Z_k^n(T_n)$ for $k \ge 1$. Note that
Proposition 2.1 tells us how the allele frequency spectrum is formed.

Remark. Delmas, Dhersin and Siri-Jégousse [11] have recently considered the lengths of
coalescent trees associated with Beta coalescents for $\alpha \in (1, 2)$. Part (1) of their Theorem
5.1 appears to be a result analogous to our Proposition 2.1.



3 Proofs 

In this section, we prove Proposition 2.1 and deduce Proposition 2.2. In order to do so,
we use the fluid limit methodology described in Darling and Norris [10]. Firstly, we need
to set up some notation. Let $\beta^{n,d}(m)$ be the drift of the process $X^{n,d}$ when it is in state
$m = (m_1, m_2, \ldots, m_{d+2}) \in \{0, 1, \ldots, n\}^{d+2}$, so that

$$\beta^{n,d}(m) = \sum_{m' \ne m} (m' - m)\, q^{n,d}(m, m'),$$

where $q^{n,d}(m, m')$ is the jump rate from $m$ to $m'$. Let $\alpha^{n,d}(m)$ be the corresponding
variance of a jump, in the sense that

$$\alpha^{n,d}(m) = \sum_{m' \ne m} \|m' - m\|^2\, q^{n,d}(m, m').$$

Let us also introduce the notation

$$\alpha_k^{n,d}(m) = \sum_{m' \ne m} |m'_k - m_k|^2\, q^{n,d}(m, m'),$$

for $1 \le k \le d + 2$, so that we may decompose $\alpha^{n,d}(m)$ as

$$\alpha^{n,d}(m) = \sum_{k=1}^{d+2} \alpha_k^{n,d}(m).$$

Finally, let $M = \sum_{k=1}^{d+1} m_k$ denote the total number of active blocks in the partition. We
will need to compute the drift and infinitesimal variance of the re-scaled process $\bar{X}^{n,d}$,
which takes values in the set

$$S^{n,d} = \left\{0, \tfrac{1}{n}, \ldots, 1\right\} \times \left\{0, \tfrac{\log n}{n}, \tfrac{2\log n}{n}, \ldots, \log n\right\}^{d} \times \left\{0, \tfrac{(\log n)^r}{n}, \tfrac{2(\log n)^r}{n}, \ldots, (\log n)^r\right\},$$

where $r = 1$ if $d = 1$ and $r = 2$ if $d \ge 2$. Denote by $\bar\beta^{n,d}(\xi)$ and $\bar\alpha^{n,d}(\xi)$ the drift and
infinitesimal variance of $\bar{X}^{n,d}$ when it is in the state $\xi = (\xi_1, \xi_2, \ldots, \xi_{d+2}) \in S^{n,d}$. Then,
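letting $m = \left(n\xi_1, \frac{n}{\log n}\xi_2, \ldots, \frac{n}{\log n}\xi_{d+1}, \frac{n}{(\log n)^r}\xi_{d+2}\right)$, we have

$$\bar\beta_k^{n,d}(\xi) = \begin{cases} \frac{1}{n \log n}\, \beta_1^{n,d}(m) & k = 1 \\ \frac{1}{n}\, \beta_k^{n,d}(m) & 2 \le k \le d + 1 \\ \frac{(\log n)^{r-1}}{n}\, \beta_{d+2}^{n,d}(m) & k = d + 2, \end{cases}$$

$$\bar\alpha_k^{n,d}(\xi) = \begin{cases} \frac{1}{n^2 \log n}\, \alpha_1^{n,d}(m) & k = 1 \\ \frac{\log n}{n^2}\, \alpha_k^{n,d}(m) & 2 \le k \le d + 1 \\ \frac{(\log n)^{2r-1}}{n^2}\, \alpha_{d+2}^{n,d}(m) & k = d + 2 \end{cases}$$

and

$$\bar\alpha^{n,d}(\xi) = \sum_{k=1}^{d+2} \bar\alpha_k^{n,d}(\xi).$$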

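Now define $b^{(d)} : \mathbb{R}^{d+2} \to \mathbb{R}^{d+2}$ co-ordinatewise by

$$b_k^{(d)}(\xi) = \begin{cases} -\xi_1 & k = 1 \\ -\xi_k + \frac{\xi_1}{k(k-1)} & 2 \le k \le d \\ -\xi_{d+1} + \frac{\xi_1}{d} & k = d + 1 \\ \rho\, \xi_d & k = d + 2. \end{cases} \qquad (2)$$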



Then the vector field is Lipschitz in the Euclidean norm with constant K = yj p z + -y . 

The function $x^{(d)}(t)$ of the previous section is the unique solution of the differential
equation

$$\frac{d}{dt}\, x^{(d)}(t) = b^{(d)}\!\left(x^{(d)}(t)\right)$$

started from $x^{(d)}(0) = (1, 0, \ldots, 0)$.
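
The claim that the explicit functions solve the differential equation can be checked numerically. The Python sketch below assumes the vector field $b_1(\xi) = -\xi_1$, $b_k(\xi) = -\xi_k + \xi_1/(k(k-1))$ for $2 \le k \le d$, $b_{d+1}(\xi) = -\xi_{d+1} + \xi_1/d$ and $b_{d+2}(\xi) = \rho\,\xi_d$, together with the closed forms $x_1(t) = e^{-t}$, $x_k(t) = te^{-t}/(k(k-1))$, $y_{d+1}(t) = te^{-t}/d$, $z_1(t) = \rho(1 - e^{-t})$ and $z_k(t) = \rho(1 - e^{-t} - te^{-t})/(k(k-1))$. It integrates the system with a classical fourth-order Runge-Kutta scheme (a standard numerical method chosen here for illustration; the function names are hypothetical) and compares the result with the closed form.

```python
import math


def b_field(xi, rho):
    """Vector field b^(d) evaluated at xi = (xi_1, ..., xi_{d+2})."""
    d = len(xi) - 2
    out = [-xi[0]]
    for k in range(2, d + 1):
        out.append(-xi[k - 1] + xi[0] / (k * (k - 1)))
    out.append(-xi[d] + xi[0] / d)      # co-ordinate d + 1
    out.append(rho * xi[d - 1])         # co-ordinate d + 2
    return out


def x_limit(t, d, rho):
    """Closed-form candidate (x_1, ..., x_d, y_{d+1}, z_d) at time t."""
    e = math.exp(-t)
    xs = [e] + [t * e / (k * (k - 1)) for k in range(2, d + 1)]
    y = t * e / d
    if d == 1:
        z = rho * (1 - e)
    else:
        z = rho * (1 - e - t * e) / (d * (d - 1))
    return xs + [y, z]


def rk4(f, x0, T, steps):
    """Classical Runge-Kutta integration of x' = f(x) on [0, T]."""
    h = T / steps
    x = list(x0)
    for _ in range(steps):
        k1 = f(x)
        k2 = f([a + 0.5 * h * b for a, b in zip(x, k1)])
        k3 = f([a + 0.5 * h * b for a, b in zip(x, k2)])
        k4 = f([a + h * b for a, b in zip(x, k3)])
        x = [a + h * (p + 2 * q + 2 * r + s) / 6
             for a, p, q, r, s in zip(x, k1, k2, k3, k4)]
    return x


def max_error(d, rho, T, steps=3000):
    """Largest co-ordinate discrepancy between RK4 and the closed form at time T."""
    numeric = rk4(lambda x: b_field(x, rho), x_limit(0.0, d, rho), T, steps)
    exact = x_limit(T, d, rho)
    return max(abs(a - b) for a, b in zip(numeric, exact))
```

A small value of `max_error(d, rho, T)` for several choices of `d` is consistent with the closed forms solving the equation from the initial condition $(1, 0, \ldots, 0)$.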



In order to prove Proposition 2.1 we need a few lemmas. Firstly, we prove two analytic
results. For $n \in \mathbb{N}$, let $h(n) = \sum_{i=1}^{n-1} \frac{1}{i}$, the $(n-1)$th harmonic number.

Lemma 3.1. Fix $R > e$. Then for $x \in [R^{-1}, 1]$,

$$\left|\frac{h(nx)}{\log n} - 1\right| \le \frac{\log R}{\log n}.$$

Proof. It is an elementary fact that, for $k \ge 2$,

$$\log k \le h(k) \le 1 + \log(k - 1) \le 1 + \log k.$$

This entails that

$$\left|\frac{h(nx)}{\log n} - 1\right| \le \max\left\{\frac{|\log x|}{\log n}, \frac{1 + \log x}{\log n}\right\} \le \frac{\log R}{\log n}$$

in the specified range of $x$. □
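Lemma 3.2. For $0 \le j \le n$ and $k \ge 0$,

$$\frac{\binom{n}{j}}{\binom{n+k}{j}} \ge 1 - \frac{kj}{n - j + 1}.$$

Proof. We have

$$\log\left(\frac{\binom{n}{j}}{\binom{n+k}{j}}\right) = -\sum_{i=0}^{j-1} \big(\log(n - i + k) - \log(n - i)\big).$$

By the mean value theorem,

$$\log(n - i + k) - \log(n - i) \le \frac{k}{n - i}, \quad 0 \le i \le n - 1.$$

Hence,

$$\sum_{i=0}^{j-1} \big(\log(n - i + k) - \log(n - i)\big) \le \sum_{i=0}^{j-1} \frac{k}{n - i} \le \frac{kj}{n - j + 1}$$

and so

$$\log\left(\frac{\binom{n}{j}}{\binom{n+k}{j}}\right) \ge -\frac{kj}{n - j + 1}.$$

Since

$$\exp\left(-\frac{kj}{n - j + 1}\right) \ge 1 - \frac{kj}{n - j + 1},$$

the result follows. □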
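
The inequality of Lemma 3.2, read as $\binom{n}{j} / \binom{n+k}{j} \ge 1 - kj/(n - j + 1)$, can be confirmed exhaustively for small parameters in exact arithmetic. The following Python sketch does this; the function names are hypothetical, and the check is of course no substitute for the proof.

```python
from fractions import Fraction
from math import comb


def lemma_3_2_holds(n, j, k):
    """Check C(n, j) / C(n + k, j) >= 1 - k j / (n - j + 1) exactly."""
    lhs = Fraction(comb(n, j), comb(n + k, j))
    rhs = 1 - Fraction(k * j, n - j + 1)
    return lhs >= rhs


def check_range(n_max, k_max):
    """Check the inequality for all 1 <= n <= n_max, 0 <= j <= n, 0 <= k <= k_max."""
    return all(lemma_3_2_holds(n, j, k)
               for n in range(1, n_max + 1)
               for j in range(0, n + 1)
               for k in range(0, k_max + 1))
```

The bound is only informative when $kj$ is small compared with $n - j + 1$; otherwise the right-hand side is negative and the inequality is trivial.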

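We now have the necessary tools to begin proving the fluid limit result.
Fix $R > e$ and let $l(n, R, d) = R^{-1} + d/n$ and

$$\mathcal{S}^{n,d} = \left\{\xi \in S^{n,d} : \xi_1 \ge l(n, R, d),\ \sum_{i=2}^{d+1} \xi_i \le R\right\}.$$

Let

$$T_R^{n,d,1} = \inf\left\{t \ge 0 : \bar{X}_1^n(t) < l(n, R, d)\right\},$$

$$T_R^{n,d,2} = \inf\left\{t \ge 0 : \bar{Y}_2^n(t) > R\right\}$$

and set $T_R^{n,d} = T_R^{n,d,1} \wedge T_R^{n,d,2}$.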

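Lemma 3.3. There exists a constant $C(R)$, depending only on $R$, such that, for $\xi \in \mathcal{S}^{n,d}$,

$$\left\|\bar\beta^{n,d}(\xi) - b^{(d)}(\xi)\right\| \le \frac{C(R)}{\log n}.$$

It follows that for $t_0 < \infty$,

$$\int_0^{T_R^{n,d} \wedge t_0} \left\|\bar\beta^{n,d}(\bar{X}^{n,d}(t)) - b^{(d)}(\bar{X}^{n,d}(t))\right\| dt \le \frac{C(R)\, t_0}{\log n}.$$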
Proof. We must perform some elementary (but rather involved) calculations. From the
rates of the process we will calculate the co-ordinates of $\beta^{n,d}(m)$ in turn. Recall first that
if $M$ active blocks are present in the partition, the next event involves the coalescence of
precisely $j$ of them at rate $\binom{M}{j} \lambda_{M,j} = \frac{M}{j(j-1)}$. Thus, we have

$$\beta_1^{n,d}(m) = -\rho m_1 - \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{b_1=1}^{m_1} b_1\, \frac{\binom{m_1}{b_1} \binom{M - m_1}{j - b_1}}{\binom{M}{j}}$$

$$= -\rho m_1 - m_1 h(M).$$
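
The simplification of the coalescence part of the first drift co-ordinate rests on the mean of a hypergeometric distribution: among $j$ blocks chosen uniformly from $M$, the expected number of singletons is $j m_1 / M$, so the double sum collapses to $m_1 \sum_{j=2}^{M} 1/(j-1) = m_1 h(M)$. The Python sketch below (hypothetical function names) verifies this identity exactly for small values of $m_1$ and $M$.

```python
from fractions import Fraction
from math import comb


def h(n):
    """h(n) = sum_{i=1}^{n-1} 1/i, the (n-1)th harmonic number."""
    return sum((Fraction(1, i) for i in range(1, n)), Fraction(0))


def coalescence_drift(m1, M):
    """Rate of loss of singletons through coalescence, computed as a double sum."""
    total = Fraction(0)
    for j in range(2, M + 1):
        rate = Fraction(M, j * (j - 1))          # total rate of a j-fold collision
        for b1 in range(1, min(m1, j) + 1):      # b1 singletons among the j blocks
            prob = Fraction(comb(m1, b1) * comb(M - m1, j - b1), comb(M, j))
            total += rate * b1 * prob
    return total


def identity_holds(M_max):
    """Check coalescence_drift(m1, M) == m1 * h(M) for all 0 <= m1 <= M <= M_max."""
    return all(coalescence_drift(m1, M) == m1 * h(M)
               for M in range(1, M_max + 1)
               for m1 in range(0, M + 1))
```

The same hypergeometric argument underlies the terms $m_k h(M)$ appearing in the other co-ordinates of the drift.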

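For $2 \le k \le d$,

$$\beta_k^{n,d}(m) = -\rho m_k - \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{b_k=1}^{m_k} b_k\, \frac{\binom{m_k}{b_k} \binom{M - m_k}{j - b_k}}{\binom{M}{j}} + \sum_{j=2}^{k} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_{k-1} \le j \\ \sum_{i=1}^{k-1} i b_i = k,\ \sum_{i=1}^{k-1} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_{k-1}}{b_{k-1}}}{\binom{M}{j}}$$

$$= -\rho m_k - m_k h(M) + \sum_{j=2}^{k} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_{k-1} \le j \\ \sum_{i=1}^{k-1} i b_i = k,\ \sum_{i=1}^{k-1} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_{k-1}}{b_{k-1}}}{\binom{M}{j}}.$$

For the $(d+1)$th co-ordinate we have

$$\beta_{d+1}^{n,d}(m) = -\rho m_{d+1} - \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{b_{d+1}=1}^{m_{d+1}} (b_{d+1} - 1)\, \frac{\binom{m_{d+1}}{b_{d+1}} \binom{M - m_{d+1}}{j - b_{d+1}}}{\binom{M}{j}} + \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_d \le j \\ \sum_{i=1}^{d} i b_i \ge d+1,\ \sum_{i=1}^{d} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_d}{b_d}}{\binom{M}{j}}$$

$$= -\rho m_{d+1} - m_{d+1} h(M) + \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_{d+1} \le j \\ \sum_{i=1}^{d+1} i b_i \ge d+1,\ \sum_{i=1}^{d+1} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_{d+1}}{b_{d+1}}}{\binom{M}{j}}.$$

Finally,

$$\beta_{d+2}^{n,d}(m) = \rho m_d.$$

Using (2) and the notation $m = (m_1, \ldots, m_{d+2})$, we obtain the following expressions:

$$\bar\beta_1^{n,d}(\xi) = -\frac{\rho\, \xi_1}{\log n} - \frac{\xi_1 h(M)}{\log n},$$

for $2 \le k \le d$,

$$\bar\beta_k^{n,d}(\xi) = -\frac{\rho\, \xi_k}{\log n} - \frac{\xi_k h(M)}{\log n} + \frac{1}{n} \sum_{j=2}^{k} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_{k-1} \le j \\ \sum_{i=1}^{k-1} i b_i = k,\ \sum_{i=1}^{k-1} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_{k-1}}{b_{k-1}}}{\binom{M}{j}},$$

$$\bar\beta_{d+1}^{n,d}(\xi) = -\frac{\rho\, \xi_{d+1}}{\log n} - \frac{\xi_{d+1} h(M)}{\log n} + \frac{1}{n} \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_{d+1} \le j \\ \sum_{i=1}^{d+1} i b_i \ge d+1,\ \sum_{i=1}^{d+1} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_{d+1}}{b_{d+1}}}{\binom{M}{j}},$$

$$\bar\beta_{d+2}^{n,d}(\xi) = \rho\, \xi_d.$$

Bearing in mind that $M = n\xi_1 + \frac{n}{\log n} \sum_{i=2}^{d+1} \xi_i$, and using Lemma 3.1, we get

$$\left|\bar\beta_1^{n,d}(\xi) - b_1^{(d)}(\xi)\right| \le \frac{\rho + \log R}{\log n}.$$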
logn 
Consider now the sum in the expression for $\bar\beta_k^{n,d}(\xi)$ when $2 \le k \le d$. We split it into two
parts, $j = k$ and $2 \le j \le k - 1$. The $j = k$ term is

$$\frac{\xi_1 + \frac{1}{\log n} \sum_{i=2}^{d+1} \xi_i}{k(k-1)} \cdot \frac{\binom{m_1}{k}}{\binom{M}{k}}.$$



By Lemma [3.21 we have 



fe(fc-l) f< 1+ ^Eg^ fc(fc-l) 



6 



< 



6 



logn \ £i — d/n 



d+l 
i=2 



Turning now to the other term, if 2 < j < k — 1, we have 



E 



/ mi \ / "ifc-i \ /mi 
Ui J ■ • • V 6fc-W < 1 _ I i 



j Xli=2 & 

(f ) logn -d/n) 



0<6 1 ,6 2 ,...,6 fc -i<; 



(1) 



< 



and so 

1 ^ M 
n ^— ' 7(7 — 1) 

E?=i *«>!=fc>E, fc =i &«=i 



E 



/ran / m fc-i \ 1 

Ui J ■ • • v 6fe-w < 1 



log n 1 1T lo. 



; n 



d+i 

i=2 



2^i=2 St 

£1 — d/n 



h(d). 



With another application of Lemma 13.11 it follows that 
\Pl'\0-b { k\®\ 



<i \{ P + \ogR)t k + 1 + 

logn 



^)E^+^f>)| 



d/n 



h(d) 
We turn finally to the expression for $\bar\beta_{d+1}^{n,d}(\xi)$. Consider the sum which constitutes the
third term. We have



E 



/ mi \ / m-d+i \ 

Ui ) • • • V fed+i J 



0<bi,fe 2 ,---,bd+i<j 



(?) 



1 - 



E 



/ mi \ / m d+1 \ 
Ui J • • ' I b d+1 ) 



0<b 1: b2,...,b d+1 <j 

T,t^ib l <d,j:f+^b l =j 

d 

= i-E E 

k=2 0<b u b 2 ,...,b d+1 <j 

Efi 1 ih=k,Y,tl h=j 



(1) 

/ mi \ / m d+ i \ 

Ui > ■ ■ ■ { b d+l ) 



(1) 



But then 



E 



E 



n ^— ' 7(7 — 1) 

j=2 JVJ ' 0<6i,6 2 ,...,6 d+ i<j 

Efi 1 ih>d+i,T,tl b r - 



, mi n / m d+ i \ 
(1) 



1 

n 



\ 



M (7) 



d k-l 

-EE 



M 



E 



/ mi \ / "ifc-i \ 
Ui ) • • • V b k -i ) 



\ 



(?) 



E«=i »i=*.E«=i 6i=j 



and so, arguing as before, we obtain 



yn.d , , , Ad) - 1 1 



l/O0-^i(0l<- + 



n logn 



(P + log 



Sj=2 



— d/n 



dh(d) 



+ d 1 + 



6 



fi - d/ra 



(i+l 

Ef< 

i=2 



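It is clear that

$$\left|\bar\beta_{d+2}^{n,d}(\xi) - b_{d+2}^{(d)}(\xi)\right| = 0.$$

Putting everything together, we obtain that

$$\left\|\bar\beta^{n,d}(\xi) - b^{(d)}(\xi)\right\| \le \frac{C(R)}{\log n},$$

for some constant $C(R)$, whenever $\xi \in \mathcal{S}^{n,d}$. The final deduction follows easily. □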



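Lemma 3.4. Fix $R > e$. Then there exists a constant $C'(R)$, depending only on $R$, such
that for $\xi \in \mathcal{S}^{n,d}$,

$$\bar\alpha^{n,d}(\xi) \le \frac{C'(R)}{\log n}.$$

It follows that for $t_0 < \infty$,

$$\int_0^{T_R^{n,d} \wedge t_0} \bar\alpha^{n,d}\!\left(\bar{X}^{n,d}(t)\right) dt \le \frac{C'(R)\, t_0}{\log n}.$$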
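Proof. Recall that for $1 \le k \le d + 2$ we have

$$\alpha_k^{n,d}(m) = \sum_{m' \ne m} |m'_k - m_k|^2\, q^{n,d}(m, m'),$$

so that

$$\alpha^{n,d}(m) = \sum_{k=1}^{d+2} \alpha_k^{n,d}(m).$$

We will deal with the co-ordinates in turn.

$$\alpha_1^{n,d}(m) = \rho m_1 + \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{b_1=1}^{m_1} b_1^2\, \frac{\binom{m_1}{b_1} \binom{M - m_1}{j - b_1}}{\binom{M}{j}} = \rho m_1 + m_1(m_1 - 1) + m_1 h(M).$$

Hence,

$$\bar\alpha_1^{n,d}(\xi) = \frac{1}{n^2 \log n}\, \alpha_1^{n,d}(m) \le \frac{\xi_1^2}{\log n} + \frac{C_1(R)}{n}$$

for some constant $C_1(R)$. For $2 \le k \le d$,

$$\alpha_k^{n,d}(m) = \rho m_k + \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{b_k=1}^{m_k} b_k^2\, \frac{\binom{m_k}{b_k} \binom{M - m_k}{j - b_k}}{\binom{M}{j}} + \sum_{j=2}^{k} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_{k-1} \le j \\ \sum_{i=1}^{k-1} i b_i = k,\ \sum_{i=1}^{k-1} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_{k-1}}{b_{k-1}}}{\binom{M}{j}}$$

$$= \rho m_k + m_k(m_k - 1) + m_k h(M) + \sum_{j=2}^{k} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_{k-1} \le j \\ \sum_{i=1}^{k-1} i b_i = k,\ \sum_{i=1}^{k-1} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_{k-1}}{b_{k-1}}}{\binom{M}{j}}.$$

Hence,

$$\bar\alpha_k^{n,d}(\xi) = \frac{\log n}{n^2}\, \alpha_k^{n,d}(m) \le \frac{\xi_k^2}{\log n} + \frac{C_k(R) \log n}{n}$$

for some constant $C_k(R)$. Furthermore,

$$\alpha_{d+1}^{n,d}(m) = \rho m_{d+1} + \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{b_{d+1}=1}^{m_{d+1}} (b_{d+1} - 1)^2\, \frac{\binom{m_{d+1}}{b_{d+1}} \binom{M - m_{d+1}}{j - b_{d+1}}}{\binom{M}{j}} + \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_d \le j \\ \sum_{i=1}^{d} i b_i \ge d+1,\ \sum_{i=1}^{d} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_d}{b_d}}{\binom{M}{j}}$$

$$\le \rho m_{d+1} + m_{d+1}(m_{d+1} - 1) + m_{d+1} h(M) + \sum_{j=2}^{M} \frac{M}{j(j-1)} \sum_{\substack{0 \le b_1, b_2, \ldots, b_d \le j \\ \sum_{i=1}^{d} i b_i \ge d+1,\ \sum_{i=1}^{d} b_i = j}} \frac{\binom{m_1}{b_1} \cdots \binom{m_d}{b_d}}{\binom{M}{j}}.$$

So we also get

$$\bar\alpha_{d+1}^{n,d}(\xi) = \frac{\log n}{n^2}\, \alpha_{d+1}^{n,d}(m) \le \frac{\xi_{d+1}^2}{\log n} + \frac{C_{d+1}(R) \log n}{n}$$

for some constant $C_{d+1}(R)$. Finally, we have

$$\alpha_{d+2}^{n,d}(m) = \rho m_d.$$

So

$$\bar\alpha_{d+2}^{n,d}(\xi) = \frac{\rho\, \xi_d \log n}{n} \text{ if } d = 1 \quad \text{and} \quad \bar\alpha_{d+2}^{n,d}(\xi) = \frac{\rho\, \xi_d (\log n)^2}{n} \text{ if } d \ge 2.$$

Hence,

$$\bar\alpha^{n,d}(\xi) \le \frac{\|\xi\|^2}{\log n} + \frac{D(R) (\log n)^2}{n}$$

for some constant $D(R)$. Since $\|\xi\|^2 \le (1 + R)^2$ in $\mathcal{S}^{n,d}$, we obtain

$$\bar\alpha^{n,d}(\xi) \le \frac{C'(R)}{\log n}$$

for some constant $C'(R)$ depending on $R$. □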
Proof of Proposition 2.1

We will, in fact, prove the stronger result that, for any d > 1 and any < 5 < 1, 

P f sup \\X n > d (t) -x {d \t)\\ > (logn)^ -> (3) 

\0<t<to / 

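as $n \to \infty$. Fix $d \ge 1$. We follow the method used in Theorem 3.1 of Darling and
Norris [10] and start by noting that $\bar{X}^{n,d}(t)$ has the following standard decomposition

$$\bar{X}^{n,d}(t) = \bar{X}^{n,d}(0) + M^{n,d}(t) + \int_0^t \bar\beta^{n,d}\!\left(\bar{X}^{n,d}(s)\right) ds, \qquad (4)$$

where $(M^{n,d}(t))_{t \ge 0}$ is a martingale in the natural filtration of $\bar{X}^{n,d}$. Since

$$x^{(d)}(t) = x^{(d)}(0) + \int_0^t b^{(d)}\!\left(x^{(d)}(s)\right) ds$$

and $\bar{X}^{n,d}(0) = x^{(d)}(0)$ for all $n \in \mathbb{N}$, we have

$$\sup_{0 \le s \le t} \left\|\bar{X}^{n,d}(s) - x^{(d)}(s)\right\| \le \sup_{0 \le s \le t} \left\|M^{n,d}(s)\right\| + \int_0^t \left\|\bar\beta^{n,d}(\bar{X}^{n,d}(s)) - b^{(d)}(\bar{X}^{n,d}(s))\right\| ds + \int_0^t \left\|b^{(d)}(\bar{X}^{n,d}(s)) - b^{(d)}(x^{(d)}(s))\right\| ds. \qquad (5)$$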
Recall that $K$ is the Lipschitz constant of $b^{(d)}$. Fix $R > e$ and let



■T*' d At 



jf ||^ d (X^(t)) - < -flog,,) ^ < ' 

^n,d,2 = I sup ||M^(t)|| < h\ogn)^e- Kt0 

{o<t<T^' d At Z 



and 



[Jo logn 
From ([5]) we obtain that for t < T^ ,d A to an d on the event fl Ut d,i H ^n,d,2, 

sup ||X n ' d (s) -a; (c0 (s)|| < (logn)^e~ x *° + X / sup ||X n ' d (r) - x (d) (r)||ds. 

0<s<i JO 0<r<s 

Hence, by Gronwall's lemma, 

\X n ' d (t) -x {d \t)\\ < (logn) 

Now, by Doob's L 2 -inequality, 



SUp M \ > - r""lt ) II < ( ln<> n.|T 

0<t<T^' d At Q 



sup ||AT' d (t)|| 2 

_0<t<T,?' d Ato 



T,?' d Ato 

a n ' d (X n > d (s))cis 



< 4E [||M n ' rf (T^ d A t ) || 2 ] < 4E 
Combined with Chebyshev's inequality, this tells us that 

Pf *ip l|M"' d (t)|| > l(logn)^e- K *,n nA3 ) < 16C ^y Kt ° 



\0<t<T„' d At 



Hence, P (£L n> d,2 \ ^n,d,z) 0. By Lemmas 13.31 and 13.41 we have P (n n ,d,i) "~ ^ 1 an d 
P (Q n ,d,3) —* 1 as " — » oo. But 

p ( sup ||x^(t) - xW(t)|| > (bgn)^ ) < 16C ^°f yt ° + p (nu u n^) , 



\0<t<T^' d At 



which clearly tends to as n — > oo. 

In fact , we wish to prove this result for to rather than ,d A to . Set 

n n ,R,d = \ sup ||X"' d (t) - x W(t)\\ < (logn)^ 

Since T^ d = T^ 1 A T^' d ' 2 , it will suffice for us to show that T^ 1 > t and T*> d > 2 > t 
on f2 n R d for all large enough n and R. 
Note firstly that $x_1(t) > 0$ for all $t \ge 0$ and that $x_1(t)$ decreases to 0 as $t \to \infty$. Take $n$
and R large enough that xi(t ) > (logn)" + l(n, R, d). Then on £l n ,R,d, 



inf 



0<t<Tn ,dll Ato 



\X?(t)\ > inf 

> Xi(t ) 



0<i<T^' d,1 Ato 



x 1 {t)-\X?(t)-x 1 (t)\ 
sup \\X n ' d (t)-x^(t)\\ 



Q<t<Tn' d /\t 



> l(n, R, d). 

Note now that < y2{t) < e _1 < R for all t > 0. Take n to be sufficiently big that 
(logn)" + e" 1 < i?. Then on {l n ,R,d, we have 

sup \Y?{t)\ < sup 



0<t<T* ,d,3 Ato 0<t<T* M At 

The desired result (El) follows. 



\X n ' d (t)-x^(t)\\ + sup |y 2 (t)|<i?. 

0<t<t o 



Proof of Proposition 2.2

For convenience we will write

$$z_1(\infty) = \lim_{t \to \infty} z_1(t) = \rho \quad \text{and, for } k \ge 2, \quad z_k(\infty) = \lim_{t \to \infty} z_k(t) = \frac{\rho}{k(k-1)}.$$

Since we can take $t_0$ arbitrarily large, we can make $z_d(t_0)$ arbitrarily close to $z_d(\infty)$.
With high probability, on the time interval $[0, t_0]$, $\frac{\log n}{n} Z_1^n\!\left(\frac{t}{\log n}\right)$ stays close to $z_1(t)$ and,
likewise, for $d \ge 2$, $\frac{(\log n)^2}{n} Z_d^n\!\left(\frac{t}{\log n}\right)$ stays close to $z_d(t)$. So the work of this proof will
be to demonstrate that $Z_d^n(t)$ does not do anything "nasty" between times $\frac{t_0}{\log n}$ and $T_n$.
(Note that this interval is potentially quite long: absorption for the coalescent takes place
at about time $\log\log n$; see Proposition 3.4 of [25].) We will have to split the interval
$\left[\frac{t_0}{\log n}, T_n\right]$ into two parts and deal with the process separately on each.
The statement of Proposition 2.2 fixes some $\delta > 0$. Take also $\eta > 0$ and fix $t_0$ such that

2zi(*o) + 2/2 (*o) < 

and, for k > 1, \z k (t ) - z k (oo)\ < -. 

(We can do this uniformly in k because of the special form of the functions xi(t), yi (t) 
and z k {t), k > 1.) Take < e < f A A xi(£ Q ). Let 



fi M = ( sup \\X^ d (t)-x^(t)\\ <e\ 
Lo<Kto J 



Since $(\log n)^{-1} T_n^{R,d,1} \le T_n$ and $\varepsilon < x_1(t_0)$, we know by the argument in the proof of
Proposition 2.1 that $T_n > \frac{t_0}{\log n}$ on $\Omega_{n,1}$ for all sufficiently large $n$ and $R$. Let

$$\tau_n = \inf\left\{ t \ge \frac{t_0}{\log n} : X_1^n(t) \le \frac{n}{(\log n)^3} \text{ and } Y_2^n(t) \le \frac{n}{(\log n)^3} \right\}.$$

We will first deal with the time interval $\left[\frac{t_0}{\log n}, \tau_n\right]$.



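Lemma 3.5. For sufficiently large $n$, we have

$$\frac{\log n}{n}\, \mathbb{E}\left[ \left( Z_1^n(\tau_n) - Z_1^n\left(\frac{t_0}{\log n}\right) \right) 1_{\Omega_{n,1}} \right] \le \frac{\delta \eta}{6} \qquad (6)$$

and, for $k \ge 2$,

$$\frac{(\log n)^2}{n}\, \mathbb{E}\left[ \left( Z_k^n(\tau_n) - Z_k^n\left(\frac{t_0}{\log n}\right) \right) 1_{\Omega_{n,1}} \right] \le \frac{\delta \eta}{6}. \qquad (7)$$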



Proof. Consider a new process (Xf(t), Y" 2 n (t))t> which starts from (X™(t ), ^T(^o)) and 
has the same dynamics as (X{ 1 (t), F 2 ™(t)) t >o, except with a stochastic time-change which 
means that time is now run at instantaneous rate $(\tilde{X}_1^n(t) + \tilde{Y}_2^n(t))^{-1}$. In other words, if



U n (s) 



and 
then 



I X n ( to±u\ + logn / to + n 
n \ logn / n \ logn 



\/ n (t)=inf{s>0:f/ n (s)>t} 



cin 



1 w n 1 V. log n y 2 w n 2 \ log n 



Let f n = C/"((logn)r„ - t ) = inf {t > : X»(t) < and f 2 "(t) < j^}- Then we 

have X?(f n ) = ±X?(T n ) and K»(f n ) = ^Y 2 n (r n ). 
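The time-change above defines $V^n$ as the right-continuous inverse of the increasing function $U^n$. The following Python sketch illustrates that construction numerically. It is only an illustration under stated assumptions: the integrand of $U$ is taken to be piecewise constant on a regular grid, and the names `make_time_change`, `rates` and `dt` are hypothetical and do not appear in the paper.

```python
from bisect import bisect_right
from itertools import accumulate


def make_time_change(rates, dt):
    """Build U(s) = integral_0^s r(u) du and its right-continuous inverse V.

    Assumption (not from the paper): r is piecewise constant, equal to
    rates[i] >= 0 on the grid cell [i*dt, (i+1)*dt).
    """
    # U evaluated at the grid points 0, dt, 2*dt, ...
    U = [0.0] + list(accumulate(r * dt for r in rates))

    def V(t):
        # V(t) = inf{s >= 0 : U(s) > t}
        if t >= U[-1]:
            return float("inf")  # U never exceeds t on the grid
        i = bisect_right(U, t) - 1  # largest grid index with U[i] <= t
        # Here U[i] <= t < U[i+1], so rates[i] > 0 and U is linear on this cell.
        return i * dt + (t - U[i]) / rates[i]

    return U, V


# A flat stretch of U (rate 0 on [1, 2)) is skipped over by V.
U, V = make_time_change([1.0, 0.0, 1.0], 1.0)
```

Composing a path with `V` gives the time-changed path; when the rate is the reciprocal of a positive quantity, the new clock runs faster where that quantity is small.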

The process $(\tilde{X}_1^n(t), \tilde{Y}_2^n(t))_{t \ge 0}$ has drift vector $\beta^n(\xi)$ in state $\xi$, where

p6 &M«6 + sfefc) 



#(0 



(6 +6) logn (6 + 6) logn 



p6 



+ 



Let 



(6 + 6) log n (6 + 6) log n (6 + 6) 

$$A^n(t) = 2\tilde{X}_1^n(t) + \tilde{Y}_2^n(t) + t, \quad t \ge 0.$$



Then A n (t) has drift 
2^(0+^(0 + 1 



6 + 6 



logn 



1 + 



p(26 + 6) 



log n 6 + 6 (6 + 6) log n n(6 + 6) ' 



in state $\xi$. Intuitively, this is small for large $n$ and so $(A^n(t))_{t \ge 0}$ is almost a martingale.
More rigorously, we have



2/3r(0+/3 2 "(0 + 



x< 26 + 6 A _hH 1 +^y 



6 + 6 



logn 



+ 



6 



logn 6 +6 
Lemma [3J] remains true if we replace R by (logn) 3 . So, since , < 1 in iS n,d , we
obtain 

2«'K) + ! < 6l0gl0E ' ! + 1 



logn 



whenever £ e <S M and 6 + G -Z fl 

^ 1 log 7i ^ z n 



(log n) 



3 • 



By the same standard decomposition as at (HI), there exists a zero-mean martingale
$(M^n(t))_{t \ge 0}$ such that

$$A^n(t) = A^n(0) + M^n(t) + \int_0^t \left( 2\beta_1^n(\tilde{X}^n(s)) + \beta_2^n(\tilde{X}^n(s)) + 1 \right) ds.$$

Fix $t_1 > 0$. For any particular $n$, $A^n(t)$ and $\int_0^t (2\beta_1^n(\tilde{X}^n(s)) + \beta_2^n(\tilde{X}^n(s)) + 1)\,ds$ are
bounded on the time interval $[0, t_1]$ and so we may apply the Optional Stopping Theorem
to obtain that

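$$\begin{aligned}
\mathbb{E}&\left[ \left( 2\tilde{X}_1^n(\tilde{\tau}_n \wedge t_1) + \tilde{Y}_2^n(\tilde{\tau}_n \wedge t_1) + (\tilde{\tau}_n \wedge t_1) \right) 1_{\Omega_{n,1}} \right] \\
&= \mathbb{E}\left[ A^n(\tilde{\tau}_n \wedge t_1)\, 1_{\Omega_{n,1}} \right] \\
&= \mathbb{E}\left[ A^n(0)\, 1_{\Omega_{n,1}} \right] + \mathbb{E}\left[ \int_0^{\tilde{\tau}_n \wedge t_1} \left( 2\beta_1^n(\tilde{X}^n(t)) + \beta_2^n(\tilde{X}^n(t)) + 1 \right) dt \, 1_{\Omega_{n,1}} \right] \\
&= \mathbb{E}\left[ \left( 2\bar{X}_1^n(t_0) + \bar{Y}_2^n(t_0) \right) 1_{\Omega_{n,1}} \right] + \mathbb{E}\left[ \int_0^{\tilde{\tau}_n \wedge t_1} \left( 2\beta_1^n(\tilde{X}^n(t)) + \beta_2^n(\tilde{X}^n(t)) + 1 \right) dt \, 1_{\Omega_{n,1}} \right].
\end{aligned}$$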



We have that $\tilde{X}_1^n(\tilde{\tau}_n \wedge t_1)$ and $\tilde{Y}_2^n(\tilde{\tau}_n \wedge t_1)$ are both non-negative and so

6 log log n + 1 \ 1 



E [(f n A < fl- 



logn 

<2(2xi(t )+y 2 (to) + 3e) 
Sr] 
6p' 



E[(2X 1 B (t )+iT(*o)) ln nil ] 



since, for large enough n, ( 1 — 61 ° g lo ° g ra n+1 



is bounded above by 2 and we have assumed 

that 2xi(t ) + 2/2(^0) < 2?" anc ^ e < nn- Letting tx | 00, we obtain by monotone 
convergence that 



72p' 



E [fnln.J < ^. 



Now, by a further application of the Optional Stopping Theorem and monotone convergence,



logn 



E 



n 



Z n x{r n )~Zl 



tr 



logn 



logn 



E 





log 71 



E 



P x?(t)dtu n>1 

pXi(s) 



l X?(s)+Y?(s) 



11 11 71, 1 
by changing variable in the integral. Similarly, for $k \ge 2$,



(logn) 



-E 



Z n k {T n )-Z% 



logn 



The result follows. 

From (6) and (7) and Markov's inequality,

P 



(logn) 



< 





n 


(logn) 2 




n 






E 






Jo 


pE [f n l 



E 



E 



log n 
Tn 



*Q_ 

log n 



pX%(t)dtln nA 

pY^{ S )dstn l 



pY 2 n (s) 



-dstn . 



3 logn^ 

< ^E 

no 



logn 

Z^n) ~ Zl 



logn 



□ 



and, for $k \ge 2$,



P 



(logn) 



3?(t») " Z 



to 

" ' logn 



< 3(logn) 2 E 
n# 



^0 



logn 



<2. 



Note that we necessarily have $\tau_n \le T_n$. Since $Z_k^n(t)$ is increasing for all $k \ge 1$ and
$Z_k^n(T_n) - Z_k^n(\tau_n) \le X_1^n(\tau_n) + Y_2^n(\tau_n) \le \frac{2n}{(\log n)^3}$ for all $k \ge 1$, we have that

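$$\frac{\log n}{n} \left( Z_1^n(T_n) - Z_1^n(\tau_n) \right) \le \frac{2}{(\log n)^2}$$

and, for $k \ge 2$,

$$\frac{(\log n)^2}{n} \left( Z_k^n(T_n) - Z_k^n(\tau_n) \right) \le \frac{2}{\log n}.$$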



For n > exp(|) these quantities are both less than ~. On fi n)1 we have 



7? 



logn 



5 

< -. 

~ 3 



By taking $n$ sufficiently large, we have by Proposition 2.1 that $\mathbb{P}(\Omega_{n,1}^c) < \frac{\eta}{2}$ and so we
conclude that

$$\mathbb{P}\left( \left| \frac{\log n}{n} Z_1^n(T_n) - z_1(\infty) \right| > \delta \right) < \eta.$$
Now consider the case $d \ge 2$. On $\Omega_{n,d}$ we have



(logn) 



71 



tc, 



logn 



Zd{to) 



< 



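and, by taking $n$ sufficiently large, we have by Proposition 2.1 that $\mathbb{P}(\Omega_{n,1}^c) + \mathbb{P}(\Omega_{n,d}^c) < \frac{\eta}{2}$.
Hence,

$$\mathbb{P}\left( \left| \frac{(\log n)^2}{n} Z_d^n(T_n) - z_d(\infty) \right| > \delta \right) < \eta.$$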



But $\eta$ was arbitrary and so this completes the proof of Proposition 2.2.



4 Comments 



4.1 Asymptotic frequencies 

It would be very interesting to have a better understanding of the distribution of the
asymptotic frequency sequence of the allelic partition associated with the Bolthausen-Sznitman
coalescent. In [18], Gnedin, Hansen and Pitman obtain relations between the
total number of blocks $N(n)$ of an exchangeable random partition restricted to the set
$\{1, \ldots, n\}$ and the asymptotic form of the sequence $(f_i^{\downarrow})_{i \ge 1}$. More precisely, they prove
that, for any $\alpha \in (0, 1)$ and any function $\ell : \mathbb{R}_+ \to \mathbb{R}_+$, slowly varying at infinity, we have



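$$\frac{N(n)}{\Gamma(1 - \alpha)\, n^{\alpha} \ell(n)} \xrightarrow{\text{a.s.}} 1 \iff \frac{\#\{i \ge 1 : f_i^{\downarrow} \ge x\}}{\ell(1/x)\, x^{-\alpha}} \to 1 \text{ as } x \to 0 \iff \frac{f_i^{\downarrow}}{\ell^*(i)\, i^{-1/\alpha}} \to 1 \text{ as } i \to \infty,$$

where $\ell^*$ is also a slowly varying function which can be expressed in terms of $\alpha$ and $\ell$.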

It would be nice to have a similar result for the allelic partition associated with the 
Bolthausen-Sznitman coalescent. There are, however, two main difficulties: first, we 
would need almost sure convergence of the rescaled process N(-), whereas here we have 
only established convergence in probability. Second, the Bolthausen-Sznitman coalescent 
corresponds to the critical case $\alpha = 1$ for which the first of the above equivalences
no longer holds. In this setting, according to Proposition 18 of Gnedin, Hansen and
Pitman [18], we have only the implication:

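$$x (\log x)^2\, \#\{i \ge 1 : f_i^{\downarrow} \ge x\} \to p \text{ as } x \to 0+ \implies \frac{\log n}{n} N(n) \xrightarrow{\text{a.s.}} p$$

and, in addition, that

$$\frac{\log n}{n} N_1(n) \xrightarrow{\text{a.s.}} p \quad \text{and} \quad \frac{(\log n)^2}{n} N_k(n) \xrightarrow{\text{a.s.}} \frac{p}{k(k-1)}, \quad k \ge 2.$$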



The form of the limits is, of course, basically the same as in our Theorem 1.1 and so we
might expect to find that $f_i^{\downarrow} \sim \frac{p}{i (\log i)^2}$ as $i$ tends to infinity.
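To get a feel for the orders of magnitude in the limits quoted in this subsection, the following Python sketch evaluates the leading-order approximations $N_1(n) \approx p\,n/\log n$ and $N_k(n) \approx p\,n/(k(k-1)(\log n)^2)$ for $k \ge 2$. It is a heuristic illustration only: the limits are asymptotic statements, no claim is made about accuracy at any finite $n$, and the function name `predicted_spectrum` and its parameters are hypothetical.

```python
import math


def predicted_spectrum(n, p, k_max):
    """Leading-order predictions for N_1(n), ..., N_{k_max}(n).

    Assumption: we simply plug n into the limiting forms
    N_1(n) ~ p*n/log(n) and N_k(n) ~ p*n/(k*(k-1)*log(n)**2), k >= 2.
    """
    log_n = math.log(n)
    spectrum = {1: p * n / log_n}
    for k in range(2, k_max + 1):
        spectrum[k] = p * n / (k * (k - 1) * log_n ** 2)
    return spectrum


spectrum = predicted_spectrum(10**6, 1.0, 50)
# Blocks of size >= 2 are of lower order than singletons by a factor log(n).
non_singletons = sum(spectrum[k] for k in range(2, 51))
```

Since $\sum_{k=2}^{K} \frac{1}{k(k-1)} = 1 - \frac{1}{K}$, the predicted number of blocks of sizes $2, \ldots, K$ is $p\,n\,(1 - 1/K)/(\log n)^2$, which is smaller than the predicted number of singletons by a factor of order $\log n$.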
4.2 Beta coalescents



The fluid limit methods used in this paper can, in principle, be extended to deal with 
other classes of coalescent process. For instance, the method seems to work for the Beta
coalescents with parameter $\alpha \in (1, 2)$. However, the calculations are more complicated
than in the Bolthausen-Sznitman case. Indeed, for the Bolthausen-Sznitman coalescent, 
the active partition is mostly composed of singletons at any time, which essentially enables 
us to neglect collisions between non-singleton blocks. This approximation does not hold 
for the Beta coalescents with $\alpha \in (1, 2)$. Since the relevant result has already been proved
by Berestycki, Berestycki and Schweinsberg [31 H] by other methods, we will not give the 
details. 
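For readers who wish to experiment numerically with the Bolthausen-Sznitman coalescent discussed here, the following Python sketch tabulates its merger rates. It relies on the standard description of this coalescent as the $\Lambda$-coalescent with $\Lambda$ uniform on $[0, 1]$, for which any fixed $k$ of $b$ blocks merge at rate $\lambda_{b,k} = \frac{(k-2)!\,(b-k)!}{(b-1)!}$; this formula is not restated in this section, and the function names `bs_rate` and `merger_size_rates` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def bs_rate(b, k):
    """Rate at which a fixed k-tuple among b blocks merges (2 <= k <= b).

    Assumption: Lambda is uniform on [0, 1], so the rate is the Beta integral
    int_0^1 x^(k-2) (1-x)^(b-k) dx = (k-2)! (b-k)! / (b-1)!.
    """
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def merger_size_rates(b):
    """Total rate of k-fold mergers among b blocks, for each k = 2, ..., b."""
    return {k: comb(b, k) * bs_rate(b, k) for k in range(2, b + 1)}


rates = merger_size_rates(10)
total_rate = sum(rates.values())  # total rate of leaving a state with 10 blocks
```

Exact rational arithmetic is used so that identities such as $\binom{b}{k} \lambda_{b,k} = \frac{b}{k(k-1)}$, and hence a total merger rate of $b - 1$, can be checked without rounding error.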

We may also consider the Beta coalescents with parameter $\alpha \in (0, 1)$. Möhle's result (CQ)
that the total number of mutations along the coalescent tree, re-scaled by n, converges in 
distribution to some non-degenerate random variable suggests that here we may expect 
to have convergence in distribution of the allelic partition to a random vector. Clearly, 
the fluid limit methods used in the present paper do not adapt to this situation, but we 
can still use them to investigate the expected value of the number of blocks of different 
sizes. Indeed, the drift of the re-scaled process 

( *m ?m \ if a- 1 

\ n ' n a ' n J U. U — X 

X£W *m *d"+i(*) Z2(t)\ ■ fH>9 

y n ' n a ' - - - ' n<* ' n a ' n a J — 

converges to an explicit function fe w (but the variance a n ' d does not tend to 0). This 
enables us to conjecture that 

$$N_1(n) \sim C_1 n \quad \text{and} \quad N_k(n) \sim C_k n^{\alpha} \text{ for } k \ge 2,$$

where $C_1, C_2, \ldots$ are strictly positive random variables. We intend to address this problem
in a future paper. 